INTRODUCTION


Theories useful for developing mathematical models or for 
designing control systems are, for the most part, pertinent to 
well-defined systems, i.e., those for which a valid model 
structure is available and for which parameter values can be 
accurately specified. As Young (1978) has pointed out, 
strategies for building models of well-defined systems are 
rarely (or never) suitable for application to poorly-defined 
systems in which uncertainties in measurements, model structure 
and parameter estimates are likely to exert a dominant influence. 
Similar constraints apply to the application of control theory 
to poorly-defined systems. Conventional methodologies cannot 
be readily used to solve a variety of important problems that 
fall into the category of "poorly-defined systems." 

Problems in the ecological sciences are often poorly-defined
(in the sense in which we use the term). This may
be attributed to a variety of reasons. Biological processes
and complex chemical reactions that take place in these
systems are not well understood, at least in quantitative
terms. Data are limited in quantity and quality, and
nonstationarity is the rule rather than the exception.
Nevertheless, the ultimate goal of many efforts relating to modelling
ecological systems is to develop a firm basic understanding
of processes and an ability to control these systems.

Sensitivity analysis is a term descriptive of a range of
methods that can be used to address the general problem of
modelling and control of ecological systems.
We present here a brief review of the literature on the use of
sensitivity analyses in modelling ecological systems, and report on
recent work by the authors in collaboration with R. C. Spear
on the development of a generalized sensitivity analysis
procedure.

PREVIOUS WORK ON SENSITIVITY ANALYSIS 

In the analysis of ecological systems, including closed 
or partly closed micro ecosystems, there is no alternative to 
utilizing some type of simulation model as the mathematical 
format into which assumptions regarding causal relations and 
parameter values are summarized. By simulation model we mean 
one whose structure and parameters are explicitly related to 
physical, chemical or biological processes. Data in the 
literature on algal growth rates as a function of nutrient 
level, for example, are often given in terms of Michaelis 
constants, a fact which points out that simulation models are 
constrained to be written in the language of the various 
disciplines which have studied the component processes of the 
system. This constraint immediately leads to the result that 
most simulation models will be complex with many parameters, 
state variables and nonlinear relations. Under the best of 
circumstances such models have many degrees of freedom and, 
with judicious fiddling, can be made to produce virtually 
any desired behavior, often with both plausible structure 
and parameter values. Because of this problem, simulation
modelling has limited importance in cases where understanding
of functional relationships is fragmentary and where extensive
data sets that quantify the system behavior are lacking.

In spite of the problems cited above the potential utility 
of information yielded by simulation models in planning 
experiments has been recognized. For example, with reference 
to ecological models, Jeffers (1972) states that 


"much time can be saved in the early stages 
of hypothesis formulation by the exploration 
of these hypotheses through mathematical 
models. Similarly mathematical models can 
be used readily to investigate phenomena from 
the viewpoint of existing theories, by the 
integration of disparate theories into a 
single working hypothesis, for example. Such 
models may quickly reveal inadequacies in 
the current theory and indicate gaps where 
new theory is required." 


Similarly, Mar (1974) in his review of multidisciplinary
modelling studies pointed out that

"The strategy to construct models without 
data and then employ sensitivity analysis 
to identify critical components where 
research and new data would enhance model 
performance is not commonly practiced." 

Stenseth (1977), while roundly criticizing simulation modelling,
admits that a simple model, when used to explore or to 
generate hypotheses, can be a valuable research tool. 

The use of parameter sensitivity in models of ecological 
systems has typically been for the purpose of analyzing 
system responses (e.g., Waide and Webster 1976; Wolaver 1980). 
These efforts are oriented, for the most part, toward linear 
systems models and thus to broad generalities in ecology and 
not to specific problems. These particular applications are
thus not strictly pertinent to analysis of regenerative life
support systems.

Several workers (e.g., Adams 1972; Haddock 1973; McCuen 
1976; Meyer 1972) have suggested that parameter sensitivity 
analysis can be used to guide future data collection efforts 
and/or to order research priorities. Such techniques might 
be useful in conducting research on poorly-defined systems. 
Traditional parameter sensitivity analysis, however, pertains
to a particular point in the parameter space (the vector space
spanned by all possible combinations of parameter values).
This requires that point estimates of all parameters be
available, which in turn, for complex ecological models, implies
that sufficient input-output data exist for rigorous model
calibration. This is an unrealistic assumption for
ecosystem simulation.

This problem of an inherent inability to specify the 
"nominal" values of parameters has significant implications 
in terms of control of ecological systems in general and of 
regenerative life support systems in particular. For example, 
O'Neill et al. (1980) deduce from a sensitivity analysis of 
a nonlinear ecological model that small parameter errors yield 
significant errors in trajectories of state variables. Similar
conclusions can be drawn from other work (e.g., Beck et al.
1979; Fedra et al. 1980; Halfon 1979). This indicates that
control schemes for poorly-defined systems must be robust
in the sense of not depending upon precise and accurate
estimates of parameter values. We will develop below a robust
technique, referred to as a generalized sensitivity analysis,
for overcoming the problem cited above, but first we present a
formal treatment of a class of modelling problems.

Mulholland and Sims (1976) have proposed a means of
solving large-scale dynamical optimization problems. Their
technique, as applied to the problem of regenerative life
support system (RLSS) control, can be formalized as follows.

Let

    \dot{x} = f(x(t), V(t), P(t))                              (1)

be a model representation of an RLSS with state vector x
simulating real system components X, V the set of controlled
parameters, and P the set of uncontrolled parameters and
environmental variables. Because it may be difficult to
formulate control laws based upon this large-scale system,
a new vector, y, of smaller dimension than x is defined. This
reduced dimension vector serves as an indicator of the overall
system performance, and can be related to x through some
vector-valued function

    y = p(x).                                                  (2)

For example, y_1 and y_2 might be the concentrations of oxygen
and carbon dioxide in the RLSS.

Next an equation is chosen to ensure the good performance
of y,

    \dot{y} = g(y, \mu)                                        (3)

where μ is an auxiliary control vector. For example, μ may
involve the use of auxiliary oxygen tanks and carbon dioxide
scrubbers subject to the failure of biological control at the
V level. The function g might be selected such that

    \dot{y}_1 < 0  if  [O_2] > 30%
    \dot{y}_1 > 0  if  [O_2] < 20%
    \dot{y}_2 < 0  if  [CO_2] > 3%.

Differentiating equation (2) with respect to time yields

    \dot{y} = (dp(x)/dx) \cdot (dx/dt) = (dp(x)/dx) \cdot f(x, V, P).

From equations (2) and (3),

    \dot{y} = g(p(x), \mu).

The control law for V(t) can therefore be calculated
from the equality

    g(p(x), \mu) = (dp(x)/dx) \cdot f(x, V, P).                (4)
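
As an illustration only, the equality in equation (4) can be solved numerically for the controlled parameter. The sketch below assumes a hypothetical one-state model in which the indicator is the state itself (the oxygen fraction); the model f, the target value, the gain, and the bisection bounds are all assumptions made for this example and are not taken from Mulholland and Sims (1976).

```python
def f(x, V, P):
    """Hypothetical model: dx/dt = V - P*x (V controlled input, P uncontrolled loss rate)."""
    return V - P * x

def p(x):
    """Indicator function y = p(x); here the indicator is simply the state."""
    return x

def dp_dx(x):
    """Derivative of the indicator function with respect to the state."""
    return 1.0

def g(y, mu, target=0.25, gain=2.0):
    """Assumed desired indicator dynamics: relax toward `target`, plus auxiliary control mu."""
    return -gain * (y - target) + mu

def control_law(x, P, mu=0.0, lo=-10.0, hi=10.0, tol=1e-10):
    """Solve g(p(x), mu) = dp/dx * f(x, V, P) for V by bisection on [lo, hi]."""
    def residual(V):
        return dp_dx(x) * f(x, V, P) - g(p(x), mu)
    a, b = lo, hi
    if residual(a) * residual(b) > 0:
        raise ValueError("no sign change on [lo, hi]; widen the bracket")
    while b - a > tol:
        mid = 0.5 * (a + b)
        if residual(a) * residual(mid) <= 0:
            b = mid
        else:
            a = mid
    return 0.5 * (a + b)
```

With an oxygen fraction of 0.35 and P = 0.5, `control_law(0.35, 0.5)` returns approximately -0.025, i.e. the controlled input is set so that the indicator decreases, in line with the requirement that the derivative of y_1 be negative when [O_2] exceeds 30%.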

The implementation of the above outlined control scheme 
requires the development of a model to simulate the RLSS 
behavior. Construction of such a model traditionally proceeds 
in three steps. 

First a scenario must be selected, that is, it must be
decided what aspect of the system is to be modelled (e.g.,
energy flux, carbon flux, phosphorus flux, etc.). This
decision will depend on the control goals and the practicality
of measurement.

Second, a model structure must be selected. This includes
both a decision on the number of state variables and the form
of interactions between variables. Unfortunately, no reliable
means of objectively selecting a model structure exists and
the modeller must therefore rely on experience and trial and
error (Jorgensen 1979). Often models of subsystems are
developed and calibrated, but these calibrations are not
always valid when the submodels are linked in a conglomerate
(Jorgensen 1979).



A second approach is to start with a simple system and
increase complexity until reliable simulation is achieved.
Jorgensen and Meyer (1977 and 1979) suggest that a new
component should be added to a system only if it contributes
significantly to the ecological buffering capacity (B), where
B is defined as the change in system loading divided by the
change in the state variable being added. Williams (1971)
found that the reliability of his simulations of a cedar bog
lake increased as he added more components and incorporated
nonlinear interactions.

Perhaps the most fervent argument in systems ecology is 
that between the proponents of linear models and nonlinear 
models. In this context, a linear model is any set of first 
order differential equations 

    dx/dt = A x(t) + z(t)                                      (5)

where the elements in the coefficient matrix A are independent 
of the state vector x. (They may, however, be time dependent.) 
The most attractive feature of linear models is that their
behavior is well understood and techniques to analyze them
already exist (Waide and Webster 1976). Patten (1975) has
hypothesized that macro scale ecological interactions are
intrinsically linear. However, it is generally conceded
that ecosystems, at least on the fine scale, show nonlinear 
behavior. Nonlinear models can be linearized for a small
envelope about the equilibrium state (x_0) by using a truncated
Taylor series expansion of the form:

    \dot{x} \approx f(x_0) + \sum_i (\partial f / \partial x_i)|_{x_0} (x_i - x_{0,i})      (6)

where dx/dt = f(x) is the nonlinear model. Bledsoe (1976) is
strongly critical of the Taylor series approximation and
linear ecosystem modelling in general, claiming it to be a
borrowed technique which "imposes the mathematics on the
biology rather than letting the ecosystem itself dictate the
way in which the model is to function." Nonetheless, much
of the work in systems ecology is based on linear models
(e.g., May 1972, 1973, 1975; Wolaver 1980; Lewis 1977).
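
The truncated Taylor expansion of equation (6) can be sketched numerically. The example below is a minimal illustration under assumed conditions: the two-state nonlinear model and its equilibrium at (0.4, 1.2) are hypothetical, and the Jacobian is estimated by central finite differences rather than derived analytically.

```python
def f(x):
    """Hypothetical two-state nonlinear model (resource x[0], consumer x[1])."""
    return [x[0] * (1.0 - x[0]) - 0.5 * x[0] * x[1],
            0.5 * x[0] * x[1] - 0.2 * x[1]]

def jacobian(func, x0, h=1e-6):
    """Central-difference estimate of the matrix of partial derivatives at x0."""
    n = len(x0)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x0), list(x0)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J

def linearized(func, x0, J, x):
    """First-order Taylor approximation of func(x) about x0."""
    f0 = func(x0)
    n = len(x0)
    return [f0[i] + sum(J[i][j] * (x[j] - x0[j]) for j in range(n))
            for i in range(n)]

x_eq = [0.4, 1.2]          # equilibrium of the hypothetical model: f(x_eq) = 0
J_eq = jacobian(f, x_eq)   # coefficient matrix of the linearized model
```

Near the equilibrium the linear approximation tracks the nonlinear model closely; the discrepancy grows with the square of the distance from x_0, which is why the linearization is valid only within a small envelope.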

Ulanowicz et al. (1978) attempted to construct an empirical
model based on a fit to a quadratic polynomial. They then conducted
stability and sensitivity analyses. Their attempts are no
more reflective of the system biology than the models
criticized by Bledsoe (1976). Tiwari et al. (1978), on the
other hand, have based their model almost exclusively on the
underlying biology, using Michaelis-Menten and [illegible] terms,
donor, recipient and third party controls, and stochastically
varying parameters, forcing functions and initial conditions.
The result is a very complex model which may be difficult
to incorporate in the control scheme described above.

The third step in developing a model is calibration. As
indicated above, this is a problem of considerable difficulty
for ecological systems. Often only qualitative data are
available and even small measurement errors in the quantitative
data can lead to wide discrepancies in the model. Several
techniques have been proposed to overcome this problem.

For some simple models or submodels, parameters can be
estimated from a least squares fit to the real system.
However, as mentioned above, such calibrations may not be valid
when submodels are incorporated into the conglomerate. In
addition, there may not be a unique solution to such a fit
or, especially for more complex systems (many degrees of
freedom), the curve fitting may not reflect biological reality.

In light of the discussion above we contend that simulation 
models for RLSS control can be useful only in a probabilistic 
context. That is, given the model and the inherent uncertainties 
in structure and parameter values the only meaningful analysis 
must focus on the probabilities of various behaviors. Most 
importantly, it must focus on the probable structures and 
parametric relations which appear consistent with that behavior 
which is associated with the desired characteristics of the 
system under consideration. One method for applying simulation 
models in a probabilistic context is to use Monte-Carlo 
techniques. (For example, see Tiwari and Hobbie (1976a, b) 
and Tiwari et al. (1978) for an application of Monte-Carlo 
simulation in ecological modelling.) The methodology developed 
below adjoins the notion of qualitative or semi-quantitative 
descriptors of the behavior of the system to Monte-Carlo 
simulation to obtain a usable technique for the analysis and 
control of poorly-defined ecological systems such as those of an RLSS. 

A GENERALIZED SENSITIVITY ANALYSIS 

a. Class of Mathematical Models. For clarity of exposition
we restrict our attention to a specific class of models and
introduce nomenclature which will be required subsequently.

Assume the processes are to be modelled by a set of first
order ordinary differential equations. (Different mathematical
structures can be dealt with in an analogous way.) Let these
equations be given in the form:

    dx(t)/dt = \dot{x}(t) = f[x(t), \xi, z(t)]

where x(t) is the state vector and z(t) a set of time variable
functions which include input or forcing functions. The vector
ξ is a set of constant parameters described more fully below.
Thus for z(t) and x(0) specified, x(t) is the solution of
the system of equations and is a deterministic or a stochastic
function of time as determined by the nature of z(t). For
simplicity of exposition, z(t) will be treated hereafter as a
deterministic function of t. Under this assumption, there are
two types of uncertainty with which we will deal: uncertainty
in the model structure, i.e., in the functions f, and
uncertainty in the parameter values ξ. Different model structures
would pertain to competing hypotheses on system functioning
(e.g., phosphorus limitation versus nitrogen limitation in a
eutrophication problem); we use the term scenario to indicate
a particular structure.

For a given scenario each element of the vector ξ is
defined as a random variable the distribution of which is a
measure of our uncertainty in the 'real' but unknown value
of the parameter. These parameter distributions are formed
from data available from the literature and from experience
with similar structures. For example, the literature suggests
that the maximum growth rate of Chlorella vulgaris is almost
certainly between 1.5 and 2.5 day^-1 at water temperatures
near 25°C. Interpreting these limits as the range of a
rectangularly distributed random variable, and forming similar
a priori estimates for the other elements of ξ, results in the
definition of an ensemble of models for a given scenario.
Some of these models will, we hope, mimic the real system with
respect to the behavior of interest.
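
The construction of such an ensemble can be sketched as follows. Only the growth-rate range (1.5 to 2.5 per day) is taken from the text; the other two parameter names and their ranges are hypothetical placeholders, and the rectangular (uniform) distributions are sampled with Python's standard `random` module.

```python
import random

# A priori ranges (lower, upper) interpreted as rectangular distributions.
PRIORS = {
    "max_growth_rate": (1.5, 2.5),    # day^-1, range quoted in the text
    "half_saturation": (0.01, 0.05),  # assumed for illustration only
    "loss_rate": (0.05, 0.30),        # assumed for illustration only
}

def sample_parameters(priors, rng):
    """Draw one parameter vector, each element uniform on its a priori range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in priors.items()}

def sample_ensemble(priors, n, seed=0):
    """Return n parameter vectors; each one defines a member of the model ensemble."""
    rng = random.Random(seed)
    return [sample_parameters(priors, rng) for _ in range(n)]
```

Each dictionary returned by `sample_ensemble` corresponds to one candidate model for the scenario; fixing the seed makes the ensemble reproducible.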

b. The Problem-Defining Behavior. Turning now to the question
of behavior, recall that for a given scenario every sample
value of ξ, drawn from the a priori distribution, results in
a unique state trajectory, x(t). Following the usual practice,
we assume that there is a set of observed variables y(t),
calculable from the state vector, which are important to the
problem at hand. So, for each randomly chosen parameter vector ξ*,
there corresponds a unique observation vector y*(t). Since
the elements of y(t) are observed (that is, we assume that
they are measured in the real system) it is sensible to define
behavior in terms of y(t). For example, suppose y_1 is the
concentration of phytoplankton in a body of water and the
problem in question concerns unwanted algal blooms due to
nutrient enrichment. Then there is some value of y_1 above
which a bloom is defined to have occurred and the behavior
is defined by this critical value.

In general a number of behavior categories can be used.
Without loss of generality, however, we can consider the case
for which behavior is defined in a binary sense, that is, it
either occurs or does not occur for a given scenario and set
of parameters ξ. It follows that a rule must be specified
for determining the occurrence or non-occurrence of the
behavior on the basis of the pattern of y(t). It is also
possible that the behavior might depend on the vector z(t).
For example, suppose one element of z(t) were water temperature.
We might be interested only in extreme values of y(t) when
adjusted or controlled for temperature variations. In any
event, the detailed definition of behavior is
problem-dependent and, for present purposes, it is sufficient to keep
in mind that a set of numerical values of ξ leads to a unique
time function y(t) which, in turn, determines the occurrence
or non-occurrence of the behavior conditioned, perhaps, by z(t).
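
A behavior-defining rule of this kind needs only an algorithmic statement. The sketch below is one hypothetical rule for the algal bloom example: the critical value of 10.0 and the optional temperature cut-off are assumed numbers chosen for illustration, not values from any study cited here.

```python
def behavior(y1, temperature=None, critical_value=10.0, min_temperature=None):
    """Binary rule: the behavior occurs if y1 exceeds critical_value at any time step.

    If both temperature and min_temperature are given, only time steps with
    temperature >= min_temperature are considered (conditioning on z(t)).
    """
    if temperature is not None and min_temperature is not None:
        y1 = [y for y, t in zip(y1, temperature) if t >= min_temperature]
    return any(y > critical_value for y in y1)
```

For instance, `behavior([2, 5, 12, 8])` is true because the series exceeds the critical value once, whereas the same kind of exceedance is ignored when it falls at a time step colder than `min_temperature`.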

c. The Analysis Procedure. We have now presented the class
of models to be studied, defined the scenario concept and
described how we propose to deal with parametric uncertainty.
For a given scenario, behavior and set of parameter
distributions it is possible to explore the properties of the
ensemble via computer simulation studies. In particular, a
random choice of the parameter vector ξ from the predefined
distributions leads to a state trajectory x(t), an observation
vector y(t) and, via the behavior-defining algorithm, to a
determination of the occurrence or non-occurrence of the
behavior. A repetition of this process for many sets of
randomly chosen parameters results in a set of sample
parameter vectors with which the behavior was observed and a set
for which the behavior was not observed. The key idea is then
to attempt to identify the subset of physically, chemically
or biologically meaningful parameters which appear to account
for the occurrence or non-occurrence of the behavior. More
traditional sensitivity analyses of large ecological models
inevitably show that a surprisingly large fraction of the
total number of parameters is simply unimportant to the critical
model behavior. We maintain that this unimportant subset or,
conversely, the critical subset may be tentatively specified
rather early in any study.

Ranking the elements of ξ in order of importance in the
behavioral context is accomplished through an analysis of
the Monte-Carlo results. The essential concept can be best
illustrated by considering a single element, ξ_k, of the vector
ξ and its a priori distribution as shown in Figure 1. Recall
that the procedure is to draw a random sample from this
parent distribution (a similar procedure is followed for all
other elements of ξ), run the simulation with this value and
record the observed behavior and the total vector ξ associated
with it. A repetition of this procedure results in two
sets of values of ξ_k, one associated with the occurrence of
the behavior, B, and the other with its non-occurrence, B̄. That
is, we have split the parent distribution into two parts as
indicated in the figure. This particular example would
suggest that ξ_k was important to the behavior since the
parent distribution
is clearly divided by the behavioral classification. Alternatively,
if the sample values under B and B̄ both appeared to be from
the original distribution then we would conclude that ξ_k
was not important.
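
The Monte-Carlo procedure just described can be sketched in a few lines. The example assumes a hypothetical logistic growth model integrated by Euler's method, with a growth rate r, a loss rate m, and a third parameter q that deliberately does not enter the model; the parameter ranges other than that of r, the carrying capacity, and the bloom threshold are all assumptions for illustration.

```python
import random

def simulate_peak(r, m, capacity=10.0, a0=0.1, dt=0.05, days=20.0):
    """Euler integration of dA/dt = (r - m)*A - (r/capacity)*A**2; returns the peak of A."""
    a, peak = a0, a0
    for _ in range(int(days / dt)):
        a += dt * ((r - m) * a - (r / capacity) * a * a)
        peak = max(peak, a)
    return peak

def monte_carlo(n, threshold=5.0, seed=1):
    """Sample n parameter vectors and split them by occurrence of the behavior."""
    rng = random.Random(seed)
    behavior, non_behavior = [], []
    for _ in range(n):
        xi = {"r": rng.uniform(1.5, 2.5),   # maximum growth rate, day^-1
              "m": rng.uniform(0.5, 1.5),   # loss rate, assumed range
              "q": rng.uniform(0.0, 1.0)}   # dummy parameter, unused by the model
        if simulate_peak(xi["r"], xi["m"]) > threshold:
            behavior.append(xi)
        else:
            non_behavior.append(xi)
    return behavior, non_behavior
```

In this toy setting the sample values of m under the behavior are concentrated toward the low end of its range, while the values of q under both classes should resemble the parent distribution, which is the pattern the text associates with important and unimportant parameters respectively.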
d. Sensitivity Ranking of Parameters. For the case where
z(t) is a deterministic function of time, the parameter space
is clearly divided by the behavioral mapping; that is, there
is no ambiguity regarding whether a given parameter vector
results in B or B̄. Our analysis then focuses on the description
of the region of parameter space associated with the behavior
and our aim is to delineate what parameters or combinations of
parameters are most important in distinguishing between B and
B̄. The hypersurface dividing the parent space cannot usually
be determined analytically for environmental systems because
of model complexity, and a statistical analysis of the
Monte-Carlo results must be used to make inferences regarding
sensitivity rankings. In general, all of the moments of the
distribution under the behavioral classification are necessary
to describe completely the shape of the two subspaces, but,
as with similar types of problems in the field of pattern
recognition, examination of the first two moments should be
sufficient in practical application.

We will restrict the discussion to the case for which 
the parameter vector mean is zero and the parameter covariance 
matrix is the identity matrix. (A suitable transformation 
can always be found to convert the general problem to this 
case.) The problem of identifying how the behavior mapping
separates the parent parameter space can then be approached
by examining induced mean shifts and induced covariance
structure.
For example, considering only shifts in the mean and
variance of the individual parameters, we can base a sensitivity
ranking on a direct measure of the separation of the cumulative
distribution functions, F(ξ_k|B) and F(ξ_k|B̄). In particular,
we utilize the statistic

    d_{m,n} = \sup_x | S_n(x) - S_m(x) |

where S_n and S_m are the sample distribution functions
corresponding to F(ξ_k|B) and F(ξ_k|B̄) for n behaviors and m non-behaviors.
The statistic d_{m,n} is that used in the Kolmogorov-Smirnov
two sample test and both its asymptotic and small sample
distributions are known for any continuous cumulative distribution
function F(ξ_k). Since S_n and S_m are estimates of F(ξ_k|B) and
F(ξ_k|B̄) we see that d_{m,n} is the maximum vertical distance
between these two curves and the statistic is, therefore,
sensitive not only to differences in central tendency but to
any difference in the distribution functions. Thus, large
values of d_{m,n} indicate that the parameter is important for
simulating the behavior and, at least in cases where induced
covariance is small, the converse is true for small values
of that statistic.
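
The statistic d_{m,n} is simple to compute from the two sets of sample values. The sketch below is a plain-Python implementation of the maximum vertical distance between two empirical distribution functions; the ranking helper and its dictionary-based input format are assumptions made for this example.

```python
def ks_statistic(sample_b, sample_nb):
    """Maximum vertical distance between the empirical CDFs of two samples."""
    a, b = sorted(sample_b), sorted(sample_nb)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both CDFs past every observation equal to x (handles ties).
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d

def rank_parameters(behavior, non_behavior):
    """Rank parameter names by d_{m,n}, largest separation first.

    Both arguments are lists of dicts mapping parameter name to sampled value.
    """
    names = behavior[0].keys()
    scores = {k: ks_statistic([xi[k] for xi in behavior],
                              [xi[k] for xi in non_behavior]) for k in names}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Two completely separated samples give a statistic of 1 and two identical samples give 0. Where SciPy is available, `scipy.stats.ks_2samp` returns the same statistic together with a significance level.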

In general, however, ranking on the basis of the separation
in the distribution functions along the original axes of the
parameter space (the individual parameter values) is not
sufficient. It is possible, for example, that the first and
second moments for a single parameter might exhibit no
separation and yet this parameter could be crucial to a
successful simulation by virtue of a strong correlation with other
parameters under the behavior (see Fig. 2). The induced
covariance structure must therefore be included in a general
sensitivity ranking. (This point is discussed fully by
Hornberger and Spear 1980b.)
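
A small numerical experiment illustrates how induced covariance can matter even when the marginal distribution of a parameter shows little shift. The behavior rule below (two parameters must nearly balance) is entirely hypothetical and is chosen only to make the effect visible.

```python
import random

def correlation(xs, ys):
    """Sample (Pearson) correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def sample_behaviors(n=4000, seed=3):
    """Draw independent pairs on [-1, 1] and keep those that produce the behavior."""
    rng = random.Random(seed)
    draws = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(n)]
    # Hypothetical rule: the behavior occurs only when the two parameters nearly balance.
    return [(a, b) for a, b in draws if abs(a - b) < 0.2]

kept = sample_behaviors()
first = [a for a, _ in kept]
second = [b for _, b in kept]
mean_shift = sum(first) / len(first)      # close to the parent mean of zero
induced_corr = correlation(first, second) # strongly positive under the behavior
```

Here the mean of the first parameter under the behavior stays near zero, so a ranking based on mean shift alone would tend to underrate it, yet the two parameters are almost perfectly correlated within the behavior class.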

e. Extension to Control System Design. There is an obvious
appeal to the notion of extending the sensitivity concept to
the problem of controlling systems that are parametrically
ill-defined. The most straightforward extension to the
control problem is to consider the design of a controller that
will deliver a high probability of adequate performance under
the uncertainty in knowledge of the process parameters
manifested by these a priori distributions. Here the binary
classification notion of the sensitivity approach is retained
in the form of adequate or inadequate system performance.
Moreover, since this performance is to be based on the
simulation results it can be defined in very practical terms and
requires only an algorithmic definition rather than an
analytically tractable formulation.

The simplest approach to controller design would appear
to involve the specification of one or more candidate controller
structures together with a set of control parameters for each
structure. Each parameter set would then be assigned a
distribution of allowable values and the problem is to select
from within this set of allowable values the one specific set
of control parameter values that maximizes the probability
of adequate performance P(B). Then, the controller structure
with the highest P(B) is the best of the candidates, with the
particular value of P(B) allowing the designer to decide whether
the risk can be accepted and the design implemented or whether greater
knowledge of the process will be needed.
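
The selection of control parameter values by maximizing P(B) can be sketched with a deliberately simple case. Everything in the example is assumed for illustration: a first-order process with uncertain coefficients a and b, a sampled proportional controller with a single gain, a short list of candidate gains in place of a distribution of allowable values, and a performance rule requiring a final error below 10% with no excursion beyond twice the target.

```python
import random

def adequate(a, b, gain, dt=0.1, steps=200, target=1.0):
    """Simulate dx/dt = -a*x + b*u with u = gain*(target - x); apply the performance rule."""
    x = 0.0
    for _ in range(steps):
        u = gain * (target - x)
        x += dt * (-a * x + b * u)
        if abs(x) >= 2.0 * target:   # excessive excursion: inadequate performance
            return False
    return abs(x - target) < 0.1 * target

def probability_of_adequacy(gain, samples):
    """Fraction of sampled process parameter sets with adequate performance."""
    return sum(adequate(a, b, gain) for a, b in samples) / len(samples)

def select_gain(candidates, n=1000, seed=2):
    """Estimate P(B) for each candidate gain on a common sample; return the best."""
    rng = random.Random(seed)
    samples = [(rng.uniform(0.5, 1.5), rng.uniform(0.5, 2.0)) for _ in range(n)]
    scores = {k: probability_of_adequacy(k, samples) for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores

best_gain, scores = select_gain([2.0, 5.0, 10.0, 20.0, 40.0])
```

Low gains fail the accuracy requirement and high gains fail the excursion requirement, so an intermediate gain gives the largest estimated P(B). Using the same parameter samples for every candidate keeps the comparison between gains from being blurred by sampling noise.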

The procedure has been successfully applied to two 
problems in the control of poorly-defined ecological systems 
during the past two months by the authors in collaboration 
with Professor Robert C. Spear of the University of California. 
One of these problems deals with control of water quality in 
a river and the other with control of a biological waste 
treatment plant. Details of this work will be included in 
the later reports on this contract. 